Phylogenetic approaches to microbial community classification

Authors

  • Jie Ning
  • Robert G. Beiko
Abstract

BACKGROUND The microbiota from different body sites are dominated by different major groups of microbes, but the variations within a body site such as the mouth can be more subtle. Accurate predictive models can serve as useful tools for distinguishing sub-sites and understanding key organisms and their roles and can highlight deviations from expected distributions of microbes. Good classification depends on choosing the right combination of classifier, feature representation, and learning model. Machine-learning procedures have been used in the past for supervised classification, but increased attention to feature representation and selection may produce better models and predictions.

RESULTS We focused our attention on the classification of nine oral sites and dental plaque in particular, using data collected from the Human Microbiome Project. A key focus of our representations was the use of phylogenetic information, both as the basis for custom kernels and as a way to represent sets of microbes to the classifier. We also used the PICRUSt software, which draws on phylogenetic relationships to predict molecular functions and to generate additional features for the classifier. Custom kernels based on the UniFrac measure of community dissimilarity did not improve performance. However, feature representation was vital to classification accuracy, with microbial clade and function representations providing useful information to the classifier; combining the two types of features did not yield increased prediction accuracy. Many of the best-performing clades and functions had clear associations with oral microflora.

CONCLUSIONS The classification of oral microbiota remains a challenging problem; our best accuracy on the plaque dataset was approximately 81%. Perfect accuracy may be unattainable due to the close proximity of the sites and intra-individual variation. However, further exploration of the space of both classifiers and feature representations is likely to increase the accuracy of predictive models.
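To make the idea of a UniFrac-based custom kernel more concrete, the following is a minimal sketch, not the authors' implementation. It assumes the unweighted form of UniFrac (the share of branch length leading to taxa found in only one of two communities) and assumes the distance is turned into a similarity with exp(-gamma * d), which is one common choice and is not guaranteed to be positive semidefinite. The toy tree, the taxon names, and the function names are all hypothetical.

```python
import math

# Hypothetical toy tree: child -> (parent, branch length). "root" has no entry.
TREE = {
    "A": ("n1", 1.0),
    "B": ("n1", 1.0),
    "n1": ("root", 0.5),
    "C": ("n2", 1.0),
    "D": ("n2", 1.0),
    "n2": ("root", 0.5),
}


def branches_to_root(taxa, tree):
    """Collect every branch (named by its child node) between the taxa and the root."""
    covered = set()
    for node in taxa:
        # Stop early once a branch is already covered: its ancestors are too.
        while node in tree and node not in covered:
            covered.add(node)
            node = tree[node][0]
    return covered


def unweighted_unifrac(sample_a, sample_b, tree):
    """Branch length unique to one sample divided by branch length seen in either."""
    a = branches_to_root(sample_a, tree)
    b = branches_to_root(sample_b, tree)
    total = sum(tree[n][1] for n in a | b)
    unique = sum(tree[n][1] for n in a ^ b)
    return unique / total if total else 0.0


def unifrac_kernel(samples, tree, gamma=1.0):
    """Assumed transform: similarity = exp(-gamma * distance) for every sample pair."""
    return [
        [math.exp(-gamma * unweighted_unifrac(x, y, tree)) for y in samples]
        for x in samples
    ]


samples = [{"A", "B"}, {"A", "C"}, {"C", "D"}]
kernel = unifrac_kernel(samples, TREE)
```

A matrix of this shape could be passed to any classifier that accepts a precomputed kernel; whether it helps is an empirical question, and the abstract reports that UniFrac-based kernels did not improve performance on these data.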


Similar resources

Phylogenetic distance in Great Salt Lake microbial communities

Investigations of community composition often rely on metrics based on the abundance of taxonomic groups to estimate biodiversity. Although traditional measures of biodiversity, such as richness and evenness, can be used in a comparative fashion to evaluate differences among communities in both temporal and spatial contexts, these measures generally omit a phylogenetic perspective of the evolut...


Analysis of Microbial Communities by Functional Gene Arrays

A major hurdle to the study of microbial communities is that only about 1% of microorganisms are cultivated (Whitman et al. 1998). As such, culture-independent approaches are necessary in order to examine the vast majority of environmental microorganisms. Many molecular techniques are available for community analysis, and most of these techniques utilize phylogenetic markers such as the 16S rRNA...


Comparative Analysis of Pyrosequencing and a Phylogenetic Microarray for Exploring Microbial Community Structures in the Human Distal Intestine

BACKGROUND Variations in the composition of the human intestinal microbiota are linked to diverse health conditions. High-throughput molecular technologies have recently elucidated microbial community structure at much higher resolution than was previously possible. Here we compare two such methods, pyrosequencing and a phylogenetic array, and evaluate classifications based on two variable 16S ...


Microarrays for bacterial detection and microbial community analysis.

Several types of microarrays have recently been developed and evaluated for bacterial detection and microbial community analysis. These studies demonstrated that specific, sensitive and quantitative detection could be obtained with both functional gene arrays and community genome arrays. Although single-base mismatch can be differentiated with phylogenetic oligonucleotide arrays, reliable speci...


A New Approach for Scalable Analysis of Microbial Communities

Motivation: Microbial communities play important roles in the function and maintenance of various biosystems, ranging from human body to the environment. Current methods for analysis of microbial communities are typically based on taxonomic phylogenetic alignment using 16S rRNA metagenomic or Whole Genome Sequencing data. In typical characterizations of microbial communities, studies deal with ...



Volume 3

Publication date: 2015